AI for identifying social norm violation

Identifying social norms and their violation is a challenge facing several projects in computational science. This paper presents a novel approach to identifying social norm violations. We used GPT-3, zero-shot classification, and automatic rule discovery to develop simple predictive models grounded in psychological knowledge. Tested on two massive datasets, the models present significant predictive performance and show that even complex social situations can be functionally analyzed through modern computational tools.


Norms, norm violation, and social emotions
Social norms cover many behavioral and moral norms that may vary across groups and levels of granularity, from society as a whole to personal norms. There seems to be an underlying universality to social norms 12 : according to the "morality-as-cooperation" theory, the moral valence of seven cooperative behaviors was found to be uniformly positive in 60 different cultures. Despite this universality, one may wonder whether the proposed methodology, which is based on culturally specific textual corpora, can generate universal conclusions. Our response to such a possible critique is twofold. First, we have a limited purpose, which is mainly methodological: to develop and test an AI-based methodology that can identify the violation of social norms rather than the norms themselves. This methodology is generic and can potentially be adapted to different cultures. We do not intend to produce universal findings or to develop a methodology for identifying social norms in different cultures. Second, we focus on identifying violations because, while norms may vary significantly across cultures, the repertoire of emotions used to respond to these violations is limited. As social emotions and their dynamic social aspect may be successfully studied across cultures, such as American-English and Chinese-Mandarin 13 , a constructive strategy for identifying the violation of social norms is to focus on a limited set of social emotions signaling the violation.
Let us elaborate on this important idea. As the number of social norms may be enormous, a simple and natural way of learning norms is through a limited number of social emotions that are evolutionarily grounded and deeply associated with a universal valuation system of human beings 14 .
For example, when people feel shame, embarrassment, or regret, it is hypothesized that they acknowledge the violation of a social norm. Therefore, the violation of a norm is accompanied and signaled by social emotions that have an important function in the "recalibration of social evaluation in the minds of self and others" 15, P. 292 . These emotions are universal, although their particular expression may be culturally grounded. As explained by Billig about embarrassment 16, P. 219 : "Embarrassment, then, can be seen to possess a universal role in supporting the moral order of everyday life, whatever the nature of moral order". The association between embarrassment and social norm violation is empirically supported 17 , as is the association between norm violation and other negative social emotions, such as shame, guilt, and regret, which were shown to be consequential to norm violation 18 . These consequential social emotions seem to be evolutionarily grounded in mechanisms of social devaluation 19 and have a clear function in supporting cooperation 20 . It is therefore hypothesized that certain social emotions may indicate a social norm violation and may be used for this paper's main task, which is the classification of cases involving norm violation/confirmation.
Inspired by the abovementioned theorization, the outline of the paper is as follows:
• First, using zero-shot classification, we tested our ability to automatically identify social emotions in short textual data.
• Next, we used GPT-3 for generating synthetic data and identified violated social norms through human domain expertise.
• Relying on the outcome of GPT-3, we identified a high-level taxonomy of norms represented by ten top-level categories.
• In the third phase, we used a massive dataset of textual data and measured social emotions and norm violations. We used features measuring social emotions, norm violation, and two other simple features for automatic rule discovery.
• Using this approach, we identified seven simple models (i.e., mathematical functions/rules) that can be used for classifying cases involving norm violation/confirmation.
• In the two major experiments, we applied these simple models to classify norm violations in two other massive datasets.
• Finally, we tested our ability to identify which norm has been violated.

Using zero-shot classification for identifying social emotions
This paper relies on zero-shot classification for identifying norm violations. Zero-shot classification treats classification as a form of natural language inference (NLI): the task is to compute the probability that each class label can be inferred (entailed) from a textual premise. For instance, given the text "John failed in his studies" and the two-label class {bad, good}, the zero-shot classifier that we used in this study 9 (accessed August 2022) inferred that the label "bad" is more reasonably entailed from the text (with a confidence level of 99%) than the label "good" (1% confidence). A zero-shot classifier predicts classes that the model did not "see" during its learning phase, which is why it is called "zero-shot". It involves a deep neural network that has already been trained; as users, we do not train the classifier but only use it for classification. We provide a text as input and several words (or larger textual units) as "labels", and ask the machine to score the input against each label. Here is another example. As input, we use the sentence "I had a car accident", and as labels, the two words "good" and "bad". The classifier "guesses" that the correct label is "bad" with a probability of 0.823.
While zero-shot classification usually involves labels that are complete semantic units (e.g., sentences), we decided, similarly to 21 , to use only the class names as class labels. We wanted to avoid the complexity of forming whole utterances and to focus on the simplest possible labels. We are currently experimenting with the engineering of class labels composed of semantic composites such as sentences and have some powerful results supporting the methodology 22 .
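The NLI view of zero-shot classification described above can be sketched in a few lines. This is a minimal illustration, not the classifier used in the study: the function `toy_entailment_logit` is a hypothetical stand-in for a trained NLI model, and the softmax normalization across labels is an assumption about how label scores are turned into probabilities (it is a common choice in NLI-based zero-shot classifiers).

```python
import math
from typing import Callable, Dict, Sequence


def zero_shot_scores(
    text: str,
    labels: Sequence[str],
    entailment_logit: Callable[[str, str], float],
) -> Dict[str, float]:
    """Convert per-label entailment logits into probabilities with a softmax."""
    logits = [entailment_logit(text, label) for label in labels]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


def toy_entailment_logit(premise: str, label: str) -> float:
    """Hypothetical stand-in for a trained NLI model (illustration only)."""
    negative_cues = {"failed", "accident"}
    words = set(premise.lower().replace(".", "").split())
    has_negative_cue = bool(words & negative_cues)
    if label == "bad":
        return 2.0 if has_negative_cue else -2.0
    return -2.0 if has_negative_cue else 2.0


# Class names alone are used as labels, as in the text above.
scores = zero_shot_scores("I had a car accident", ["good", "bad"], toy_entailment_logit)
top_label = max(scores, key=scores.get)
print(top_label, scores)
```

In practice, `toy_entailment_logit` would be replaced by a pretrained NLI model; the surrounding logic (score every label against the same premise, then normalize) stays the same.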
In this paper, we have mainly used two sets of features/variables: (1) social emotions and (2) social norms. We first tested the performance of a zero-shot classifier in identifying five basic social emotions.

Method
Using zero-shot classification for identifying social emotions. To test the zero-shot classifier's performance, we adopted 25 short scenarios used by 23 to measure social emotions. For each scenario, we produced the classification scores of the five social emotions used in their study: pride, gratitude, guilt, anger, and sadness. In other words, for each scenario, we produced five scores using the zero-shot classifier. We identified the emotion scored at the top by the classifier and compared it to the emotion scored at the top by the human subjects in 23 . We hypothesized that if the zero-shot classification may label the emotion in the text in agreement with the dominant emotion identified by the human judges in 23 , then the classifier can be used for successfully identifying social emotions in short texts.
Results of testing the zero-shot classification in identifying social emotions. For each scenario, we identified the social emotion that, on average, was scored highest by the human judges in 23 . Next, we applied the zero-shot classifier, which produced five scores for the five social emotions, and identified the emotion scored highest by the classifier. Under full ignorance, the baseline for predicting the social emotion in a scenario is p = 0.2, as for each scenario the classifier would have to guess one of five social emotions arbitrarily. However, the zero-shot classifier's top-identified emotion was identical to the human subjects' top-identified emotion in 64% of the cases. We want to clarify that the 64% score is not the share of correct answers but the percentage of cases in which the zero-shot classifier identified the same label as the leading label produced by the human raters in 23 , who are therefore our gold standard for testing the system. To ask how likely it is to reach 64% agreement by random guessing, we used the Binomial distribution, which shows that such a level of success is highly unlikely under chance (p = 0.0001797, or 0.0002 rounded). This result provides preliminary support for the classifier's ability to identify social emotions in short textual data, even when single emotion words are used as class labels.
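The chance-level comparison described above can be sketched as follows. This is an illustration under stated assumptions: 64% of 25 scenarios is read as 16 matches, and the sketch computes both the point probability of exactly 16 matches and the upper-tail probability of 16 or more. The text does not state which quantity, or which software, produced the reported p-value, so the figures printed here need not coincide with it.

```python
from math import comb
from typing import Sequence


def agreement_count(predicted: Sequence[str], reference: Sequence[str]) -> int:
    """Number of scenarios where the classifier's top label equals the human top label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(p == r for p, r in zip(predicted, reference))


def binomial_point(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(binomial_point(i, n, p) for i in range(k, n + 1))


n_scenarios = 25   # number of scenarios adopted from 23
n_matches = 16     # assumption: 64% of 25 scenarios
chance = 0.2       # one of five emotions guessed at random

point = binomial_point(n_matches, n_scenarios, chance)
tail = binomial_upper_tail(n_matches, n_scenarios, chance)
print(f"P(X = {n_matches}) = {point:.3g}; P(X >= {n_matches}) = {tail:.3g}")
```

Whichever of the two quantities is used, the conclusion is the same in kind: agreement at this level is very unlikely under random guessing.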

Identifying categories of social norms.
For the automatic identification of norm violation, we used situations labeled by social emotions associated with norm violation (e.g., shame). We used these situations to identify the underlying social norms automatically and to categorize them into a few top-level categories. Specifically, we used the EmpatheticDialogues dataset 24 , a dataset of 25 k conversations grounded in situations, each labeled according to one of 32 emotion labels. First, we selected situations labeled by the social emotions of embarrassment, shame, and guilt (N = 4791). Next, for each of the situations, we used GPT-3 with the following general template for automatically generating the social norm that has been violated: The code gets as input a situation (e.g., I felt like a fool when …) and its labeled emotion (e.g., ashamed). The template is formed as a dialogue between a speaker and a listener. First, the speaker presents the situation (e.g., "Man when I first lost …"). The listener then asks the speaker how (s)he felt in the situation. The speaker's answer is formed in such a way that the labeled emotion is built into it ("I felt ashamed"), and GPT-3 is asked to complete the sentence and to explain the emotion. GPT-3's first completion is fed back into the dialogue as a prompt, and the listener asks GPT-3 to explain the social norm violated by completing the sentence "The social norm that I violated is …". GPT-3's second completion aims to expose the social norm violated. We ran this procedure on 4971 situations with the temperature parameter set to 0.7, and qualitatively analyzed the outcomes of a sample of 300 situations to identify top-level categories of norms. Based on the qualitative analysis and expertise of the first author, the identified norms were:
www.nature.com/scientificreports/
8. Conformity: The valued individual is aware of social norms and respects social norms, specifically in the public sphere.
9. Decency: The valued individual associates with other valued individuals and maintains social ties with valued individuals.
10. Loyalty: The valued individual is loyal to her relevant social groups (e.g., a football team, ethnic group, nation).
These norms/dimensions have been translated into the following oppositional two-class categories as described in Table 1:
Testing the zero-shot classification in identifying norm confirmation/violation. We used the EmpatheticDialogues dataset and, in addition to situations labeled with guilt, embarrassment, and shame, identified situations in which the subject felt proud or grateful (N = 8886).
For example, the following situation is labeled as involving shame: "After learning my husband of only 4 weeks was cheating on me I did the same to him. Not only was it before our marriage but I had found texts of him contacting her for sex after. To make it worse, I slept with his close family friend".
And the following situation involves the emotion of pride: "I am very happy to have been first over 300 students during this years at my enginering school"
Each situation has been used as an input to the zero-shot classifier. For each situation, the classifier used each of the ten dimensions by using as labels the two opposing poles of the dimension (e.g., polite vs. impolite). For example, we used as input the situation: "After learning my husband of only 4 weeks was cheating on me I did the same to him. Not only was it before our marriage but I had found texts of him contacting her for sex after. To make it worse I slept with a close family friend of his".
The classifier used the words "polite" and "impolite" as labels and produced the probability that the above situation should be classified as polite or impolite. We then computed as a single score the difference between the classification score of the positive class of the norm (e.g., polite) and the classification score of the negative class of the norm (e.g., impolite). This procedure resulted in ten scores corresponding with the social norms (i.e., social norms scores).
We hypothesized that if the zero-shot classifier may successfully identify norm confirmation/violation, situations labeled as proud/grateful, and hypothesized to accompany norm confirmation, should score higher on social norms than situations labeled as embarrassed, ashamed, or guilty.
We further hypothesized that situations labeled as embarrassed, ashamed, or guilty should score below zero, indicating that they score higher in the negative class of the norm, meaning that they involve norm violations.
It must be emphasized that emotions associated with a violation are negative, and those associated with norm adherence are positive. However, not all negative emotions are associated with norm violation, and not all positive emotions are associated with norm adherence. Jealousy, for instance, is a negative emotion not associated with norm violation. This is why we adhere to Sznycer's important work and to a specific and limited set of social emotions involving social valuation vs. devaluation.
To test these hypotheses, and to avoid bias, we first removed situations in which the emotion has been explicitly mentioned in the situation. Overall, we have automatically analyzed N = 8352 situations and for each situation produced ten scores corresponding with each of the ten norms identified above (see Table 1).
The procedure is summarized as follows:
1. The input is a list of the labeled situations from the EmpatheticDialogues dataset. Each situation is labeled according to one of the social emotions (e.g., guilt).
2. Iterate over situations 1 to N and, for each situation, over social norms 1 to 10.
3. Apply the zero-shot classifier to each situation using two labels: one indicating the positive aspect of the norm (e.g., politeness) and the other indicating the negative aspect of the norm (e.g., impoliteness).
4. Compute the difference between the probability of the norm's positive aspect and the probability of the norm's negative aspect.
5. The output is the list of labeled situations with 10 scores for each situation, each indicating the extent to which the zero-shot classifier judged the situation to be "norm adherence" (i.e., positive score) or "norm violation" (i.e., negative score).
6. Use the 10 scores (i.e., features) to test the hypothesis that situations labeled with positive social emotions (e.g., pride) score higher than situations labeled with negative social emotions (e.g., guilt).
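The scoring procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: `stub_classifier` is a hypothetical stand-in for the zero-shot classifier, the two example situations are invented, and only the polite/impolite label pair comes from the text; the second label pair is placeholder wording, since Table 1 is not reproduced here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A classifier maps (text, labels) to a probability per label.
Classifier = Callable[[str, Sequence[str]], Dict[str, float]]

# (positive pole, negative pole) per norm. Only polite/impolite is taken from
# the text; the "trust" wording is a hypothetical placeholder.
NORM_LABELS: Dict[str, Tuple[str, str]] = {
    "politeness": ("polite", "impolite"),
    "trust": ("trustworthy", "untrustworthy"),  # hypothetical wording
}


def norm_scores(
    situation: str,
    classifier: Classifier,
    norm_labels: Dict[str, Tuple[str, str]] = NORM_LABELS,
) -> Dict[str, float]:
    """Score = P(positive pole) - P(negative pole); negative suggests violation."""
    scores: Dict[str, float] = {}
    for norm, (positive, negative) in norm_labels.items():
        probs = classifier(situation, [positive, negative])
        scores[norm] = probs[positive] - probs[negative]
    return scores


def stub_classifier(text: str, labels: Sequence[str]) -> Dict[str, float]:
    """Hypothetical stand-in: favors the second (negative) label on a cue word."""
    first, second = labels
    p_first = 0.1 if "cheating" in text.lower() else 0.9
    return {first: p_first, second: 1.0 - p_first}


# Invented examples: (situation, emotion label).
labeled_situations: List[Tuple[str, str]] = [
    ("I helped my neighbor move", "proud"),
    ("I was cheating on the exam", "guilty"),
]
table = [(emotion, norm_scores(text, stub_classifier)) for text, emotion in labeled_situations]
for emotion, scores in table:
    print(emotion, scores)
```

Each row of `table` corresponds to one labeled situation with one score per norm, which is the shape of the feature table the later hypothesis tests operate on.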

Results of zero-shot classification in identifying norm confirmation/violation. We used the Kruskal-Wallis H test to compare the ten social norms scores across situations. For example, Fig. 1 presents the results for the Competence norm: situations labeled as proud and grateful (i.e., norm confirmation) scored higher than situations labeled with guilt, shame, and embarrassment, and situations hypothesized to indicate norm violation scored below zero, as expected. A Kruskal-Wallis H test showed a statistically significant difference in competence between the different situation types, χ²(4) = 5909, p < 0.001. Figures 2-8 present the results for the other norms. In all of the cases, the null hypothesis has been rejected. Only for one norm (i.e., caring) were the scores of the situations hypothesized to signal norm violation not lower than 0, although the difference between the situations was statistically significant. In addition to competence, the results therefore fully support our hypothesis for politeness (χ²(4) = 5690, p < 0.001), trust (χ²(4) = 5785, p < 0.001), loyalty (χ²(4) = 5781, p < 0.001), discipline (χ²(4) = 5271, p < 0.001), agreeableness (χ²(4) = 5621, p < 0.001), success (χ²(4) = 5737, p < 0.001), and decency (χ²(4) = 5702, p < 0.001). These results support the hypothesis that the zero-shot classifier can be used for identifying norm confirmation/violation.
Automatic rule discovery and norm classification. Given the success of the zero-shot classification in identifying social emotions and norm violation, our next goal was to identify simple rules that may be used for classifying situations involving norm violation/confirmation.
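For readers unfamiliar with the Kruskal-Wallis H test used in the results above, the statistic can be sketched in plain Python. This is an illustration on toy data, not a reproduction of the reported analysis: it pools all observations, assigns average ranks to ties, and applies the standard tie correction. With five emotion groups the statistic is compared against a chi-square distribution with 4 degrees of freedom, which matches the χ²(4) values reported.

```python
from collections import Counter
from typing import Sequence


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    counts = Counter(pooled)

    # Average rank per distinct value: tied values share the mean of their ranks.
    rank_of = {}
    position = 0
    for value in sorted(counts):
        ties = counts[value]
        rank_of[value] = position + (ties + 1) / 2
        position += ties

    # Sum over groups of (rank sum)^2 / group size.
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    tie_correction = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
    if tie_correction == 0:
        raise ValueError("all observations are identical")
    return h / tie_correction


# Toy norm scores for three hypothetical emotion groups.
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_statistic = kruskal_wallis_h(toy_groups)
degrees_of_freedom = len(toy_groups) - 1
print(h_statistic, degrees_of_freedom)
```

In practice a statistics library would be used to obtain the p-value as well; for example, `scipy.stats.kruskal` returns both the statistic and the p-value.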

Methods
We grouped situations labeled with guilt, embarrassment, and shame as involving norm violation (Class 1, N = 4731) and situations labeled with pride and gratitude as involving norm confirmation (Class 0, N = 3621). To test the predictive power of the features measured through the zero-shot classifier, we applied the approach used by 13 , where a Symbolic Classification is first applied for automatic rule discovery, identifying a simple mathematical rule for classification. This mathematical rule (i.e., model) is then used for producing a single feature for a CRT machine learning classifier tested on other datasets. Different ML classifiers can be used, and some of them (e.g., Boosting classification) are known to produce even better results than our chosen classifier. However, we chose CRT for several reasons. First, as we explain in this paper, we focus on identifying simple predictive models rather than gaining the best results. Second, classification and regression trees are known to be simple and powerful 25 , so they are a natural choice. Third, as our approach follows the proposal of 13 , who also used CRT, we chose this model to allow continuity with their approach. This approach has been inspired by the successful venture in natural science to automatically identify simple rules (i.e., equations) governing the behavior of systems 11,26-28 and by the idea of "distilling free-form natural laws from experimental data" (the title of 11 ). We attempted to identify simple rules for classifying cases involving norm violation. We used HeuristicLab's Optimizer 3.3.16 29 to perform a symbolic classification analysis 28 with a maximum symbolic expression length of 5 and a maximum symbolic tree depth of 5.
We used the following features: (1) the scores of the ten social emotions (see Appendix 1 for the list of emotions); (2) the ten social norms scores; (3) the sentiment score of the situation, measured as the difference between the positive and negative scores produced by the RoBERTa sentiment analysis tool 30,31 (variable: "NegativeSentiment"); and (4) a new feature, "bad", calculated as the difference between the classification scores of the two oppositional classes {bad, good}. By the system's default, 66% of the dataset was used to train the model and the rest to test it. We performed ten folds, each running for 1000 generations and examining 99,100 solutions. The target class was "1" (i.e., norm violation). Table 2 presents the models' performance in terms of accuracy/precision, the mathematical function identified by the procedure, and the parameters' values.

Results of the automatic rule-discovery and norm classification.
On average, the models gained 94% accuracy and 96% precision. Interestingly, most models use social emotions as their features/variables, and the measurement of the norms plays no role in the final models. As the models within the pairs 5 & 8, 9 & 10, and 6 & 7 do not differ from each other, and for the sake of scientific parsimony, we removed models 7, 8, and 10 from the analysis and retained seven models only.
For testing the validity of the models and their generality beyond the specific dataset, we followed the approach of 13 for the models' cross-domain validation and used two different massive datasets. It must be emphasized that in this paper, we seek to identify (1) theoretically grounded, (2) simple, and (3) interpretable models for the automatic identification of norm violation. Therefore, the tests presented in the next sections do not focus on performance measures per se. In sum, we:
• Grouped the situations into two classes: one indicating norm violation and one indicating norm adherence.
• Used a Symbolic Classification procedure to generate simple models (i.e., rules) that classify situations into the two classes by using the scores (i.e., features) described in the previous section. Each model formed through the automatic rule discovery procedure generates a score indicating whether the situation represents norm violation or norm adherence.
• Used this single score, produced by each rule, as a feature in another machine-learning model, and tested how well these rule-generated scores predict norm adherence/violation.
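The pipeline summarized above (a discovered rule produces one score, and that score is the only feature of a tree classifier) can be sketched as follows. Everything specific here is an assumption made for illustration: `hypothetical_rule` is not one of the models in Table 2, the feature values are invented, and a single Gini-based split stands in for the full CRT procedure.

```python
from typing import Dict, List, Sequence, Tuple


def hypothetical_rule(features: Dict[str, float]) -> float:
    """Illustrative stand-in for a discovered rule; NOT a model from Table 2."""
    return 1.5 * features["shame"] + 0.5 * features["bad"] - 0.2


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a set of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Threshold minimizing the weighted Gini impurity (one CART-style split)."""
    pairs = sorted(zip(scores, labels))
    best_threshold, best_impurity = pairs[0][0], float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical scores
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_impurity = impurity
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_threshold


def predict(score: float, threshold: float) -> int:
    """Assumes higher rule scores indicate norm violation (class 1)."""
    return 1 if score > threshold else 0


# Invented feature values; label 1 = norm violation, 0 = norm confirmation.
train: List[Tuple[Dict[str, float], int]] = [
    ({"shame": 0.9, "bad": 0.8}, 1),
    ({"shame": 0.7, "bad": 0.6}, 1),
    ({"shame": 0.1, "bad": -0.5}, 0),
    ({"shame": 0.05, "bad": -0.9}, 0),
]
rule_scores = [hypothetical_rule(f) for f, _ in train]
labels = [y for _, y in train]
threshold = best_split(rule_scores, labels)
predictions = [predict(s, threshold) for s in rule_scores]
print(threshold, predictions)
```

The design point the sketch tries to convey is that all of the modeling effort sits in the rule; the downstream classifier only has to place a cut on a single number.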
The major experiments: testing the predictive power of the models. Experiment 1. We first used the Moral Stories 32 dataset to test the previously discovered models. Moral Stories is a "crowd-sourced dataset of structured, branching narratives for the study of grounded, goal-oriented social reasoning" (ibid, p. 698). The dataset includes narratives structured as a context (composed of the norm, the situation, and the intention of the subject), an action that follows a normative path with its consequences, and an action that follows a "divergent path" with its consequences. For example:
As normative actions correspond with a social norm and as divergent actions express the violation of the norm, we tested the models previously identified by using them to classify actions labeled as norm confirmation vs. norm violation. As input, we used the action (normative vs. non-normative). Each labeled action was processed according to the procedure previously described, and for each action, a score has been produced by each of the seven models. In other words, the algorithm took the action as input, used the zero-shot classifiers to produce measurements for the features, and applied the models to produce seven new scores, each indicating whether the action should be classified as involving norm violation or confirmation. Overall, we analyzed 12,000 actions labeled "1" (i.e., norm violation) vs. 12,000 labeled "0" (i.e., norm confirmation).
In sum:
• The input is an action labeled as "normative" or "non-normative".
• For each action, we apply each of the rule-based models and generate for each model a single score indicating whether the action involves norm adherence or norm violation.
• Finally, we use the score produced by each model as a feature in another ML model.
Results of experiment 1. First, we measured the reliability of the models using Cronbach's Alpha. Cronbach's Alpha scored 0.73, indicating that the models produced for the EmpatheticDialogues dataset gain moderate "inter-judge" reliability when applied to another dataset. We used a Classification and Regression Tree (CRT) decision model (using IBM SPSS), with the ranked score of each model (e.g., Model 1) as the only feature in the CRT machine learning model. In other words, we attempted to predict whether an action involves norm violation by using a single feature (i.e., the score produced by a model), and for prediction/classification, we used the CRT machine learning (ML) model. In addition, for each analysis, we used ten-fold cross-validation. Given the baseline of 50% of the actions labeled as "violation", we hypothesized that if the automatically discovered models/rules are valid, then even a single feature produced through a model may be used to classify the cases beyond the baseline. The results of the analysis are presented in Table 3. All models improved prediction over the baseline, with an average of 68% accuracy, 71% recall, and 68% precision. When using the full set of features in the CRT model, we gained 78% accuracy, 81% recall, and 77% precision. When fitting a Backward Binary Logistic Regression model to the data, with the full set of features as input variables, we gained a statistically significant model (χ²(18) = 10,541, p < 0.001) with 78% accuracy, 83% recall, and 75% precision, but with 18 features (out of 22) included in the final model. It is quite remarkable that simple models identified through automatic rule discovery, involving only four social emotions, one sentiment measurement, and one score indicating whether the situation is "good" or "bad", may be used for generating a single feature that competes with the results of the Binary Logistic Regression with its model fit and 18 features.
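The reliability measure mentioned above can be illustrated with the standard formula for Cronbach's Alpha, treating each model as an "item" and each action as a case. The toy scores below are invented; the sketch assumes the sample variance (n - 1 denominator) and does not attempt to reproduce the SPSS computation.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of case totals).

    item_scores[i][j] is the score that item (model) i gives to case j.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n_cases = len(item_scores[0])
    if any(len(item) != n_cases for item in item_scores):
        raise ValueError("all items must score the same cases")
    totals = [sum(item[j] for item in item_scores) for j in range(n_cases)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("case totals have zero variance")
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented scores from two hypothetical models on four actions.
toy_scores = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
]
alpha = cronbach_alpha(toy_scores)
print(alpha)
```

Values closer to 1 indicate that the items (here, the models) rank the cases in a similar way.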
Similar results to those of the CRT were gained using a Boosting Classification model with all of the previously mentioned features (e.g., the social emotions scores). The model gained 79% accuracy, 78% precision, and 82% recall. However, we emphasize again that our aim is not to present the best classification performance but to show that simple models using a few theoretically grounded features may substantially improve the automatic identification of norm violation.

Experiment 2.
For a second validation of the models, we used the SOCIAL-CHEM-101 dataset 33 , where 104 k real-life situations were identified and processed through crowdsourcing. More specifically, we analyzed actions (e.g., run the blender at 5 am) that appear in the dataset and are labeled as "legal" or "illegal". As the percentage of illegal actions was small, we identified all actions labeled as illegal (N = 5934) and matched them with a set of N = 5934 unique actions labeled as legal. For each illegal entry, we matched a unique legal entry by using several parameters that characterize the action. For example, the dataset includes the column of "moral foundations" (i.e., care-harm, fairness-cheating, loyalty-betrayal, authority-subversion, sanctity-degradation). Therefore, for an illegal action labeled as "care-harm", for instance, we matched a legal case with the same category of moral foundation.
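The matching step described above can be sketched as a one-to-one pairing on a shared category. This is a simplified illustration: the text says several parameters were used for matching, while the sketch matches on the moral foundation only, and the records and the field name `moral_foundation` are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, List, Tuple

Record = Dict[str, str]


def match_legal_to_illegal(
    illegal: List[Record],
    legal: List[Record],
    key: str = "moral_foundation",  # hypothetical field name
) -> List[Tuple[Record, Record]]:
    """Pair each illegal action with a distinct legal action sharing the same key."""
    pool = defaultdict(deque)
    for action in legal:
        pool[action[key]].append(action)
    pairs = []
    for action in illegal:
        candidates = pool[action[key]]
        if candidates:
            # popleft removes the candidate, so no legal action is used twice
            pairs.append((action, candidates.popleft()))
    return pairs


# Hypothetical records for illustration only.
illegal_actions = [
    {"id": "i1", "moral_foundation": "care-harm"},
    {"id": "i2", "moral_foundation": "fairness-cheating"},
]
legal_actions = [
    {"id": "l1", "moral_foundation": "fairness-cheating"},
    {"id": "l2", "moral_foundation": "care-harm"},
    {"id": "l3", "moral_foundation": "care-harm"},
]
pairs = match_legal_to_illegal(illegal_actions, legal_actions)
print([(a["id"], b["id"]) for a, b in pairs])
```

Matching on the category keeps the two classes balanced not only in size but also in the distribution of moral foundations, so that the classifier cannot exploit the category itself as a shortcut.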
As illegal actions indicate that a social norm has been violated, we asked whether the models can be used for classifying an action as illegal (class "1") or legal (class "0"). Each action has been subjected to the analysis of the relevant features, and using the features, the models have been applied to produce seven scores for each action.
In sum:
• The input was an action labeled as legal or illegal.
• For each action, we applied the seven rule-based models, each generating a single score indicating whether the action involves norm adherence or norm violation.
• The scores produced by the rule-based models were used as features for another ML model that classified the action as norm violation or norm adherence.
Results of experiment 2. We first computed the scores produced by the seven models. Their Cronbach's Alpha score was 0.93, indicating that the seven models highly agreed. We gained 82% accuracy, 87% recall, and 80% precision using the scores of the seven models as features in a CRT ML model and the same procedure as applied before. The detailed performance of the seven models appears in Table 4, with an average of 76% accuracy, 77% precision, and 78% recall. Using the full set of features in a CRT model gained 89% accuracy, 87% precision, and 84% recall.
Identifying which norm has been violated. For testing our ability to identify which norm has been violated, we used the SOCIAL-CHEM-101 dataset and cases labeled as illegal. Following Graham et al. (2013), SOCIAL-CHEM-101 labels the situations according to five moral norms. We identified cases where the situations were labeled according to only one of these norms. It is important to emphasize that our algorithm is generic and can test the violation of different social and moral norms. Previously, we used the norms identified through GPT-3; for measuring the norm violations in the SOCIAL-CHEM-101 dataset, we used the norms proposed by the researchers who developed this dataset. The distribution of these moral norms in SOCIAL-CHEM-101 appears in Table 5. Given the extremely low prevalence of the loyalty norm, it has been removed from the analysis, and the moral norms that we analyzed were care (50.9%), fairness (26.9%), authority (12.8%), and sanctity (9.4%). For identifying which norm has been violated, we used a Boosting Classification ML model with five-fold cross-validation and the full set of features as previously presented (e.g., the norm violation scores).
Results of violation identification. The model performance in identifying the violated norm is presented in Table 6. We can think about the improvement in prediction in terms of a random guess with the limited knowledge of the norm's prevalence in the dataset; for instance, knowing that the care dimension is approximately 51% of the dataset, we could have randomly guessed in 51% of the cases that the violated norm is the one of caring. In this context, an improvement may be conceived as an improvement in prediction over the baseline.
On average, there was a 34% improvement in prediction. The five features with the highest influence in the model were four of the norm violation measures, namely (1) trust, (2) caring, (3) loyalty, and (4) conformity, and one social emotion (anger). This improvement in prediction indicates that norm violation may be automatically identified and that the exact type of the violated norm can also be identified using the social norm scores and the social emotion measures as computed through the zero-shot classification procedure.
Ethical approval. All methods were carried out in accordance with relevant guidelines and regulations.

Discussion
Social norms are either descriptive, representing the prevalence of a certain behavior (e.g., avoiding tax payment), or injunctive, representing the extent to which the behavior is approved by a relevant reference group. As social animals, human beings are specifically sensitive to the valuation of others and hence to injunctive norms and their violation, which is accompanied by social emotions such as guilt (e.g., 34 ). While social norms may be culturally specific and cover numerous informal "rules", how people respond to norm violation through evolutionary-grounded social emotions may be much more general and provide us with cues for the automatic identification of norm violation.
In this paper, we have developed and tested several models for automatically identifying the violation of social norms. One would hardly find papers dealing with the automatic identification of social norm violations. Those focusing on norm violation usually concern norms of interaction in specific communities such as Reddit 35,36 . This scarcity of research in identifying and recognizing social norm violations points to the fact that the automatic identification of social norm violations is an open challenge.
We have shown that a substantial improvement in the prediction of social norm violation may be gained through (1) the measurement of social emotions and social norms in textual data, (2) zero-shot classification, (3) the use of the measured features in the automatic rule-discovery algorithm, and (4) the use of the discovered rules for generating simple features used in a Decision Tree model. Moreover, using GPT-3 and domain expertise, we were able to identify top-level categories of social norms. Using the top-level categories of social norms, we were able to correctly identify the exact norm that has been violated. However, the number of social norms that we tested in the dataset was limited. The paper is, therefore, another instance in which modern large language models, such as GPT-3, combined with the domain expertise of a discipline (e.g., psychology), may advance research in psychology, the social sciences, and the humanities 37 . Our paper presents some preliminary results, which will be developed in future studies to include the identification of norm violation in conversation, the use of multimodality for social norm violation, and the use of large language models to identify culturally specific norms.
Our study is limited to developing tools for identifying the violation of "general" social norms. However, the granularity level of norms may span from social groups to the individual level. In this context, it was argued that "people have preferences for following their 'personal norms' what they believe to be the right thing to do" 38, p. 2 and that personal norms may be a powerful explanatory idea in understanding human unselfish behavior. This idea may be tested using tools of computational personality analysis 39 , where relatively stable patterns of thought, emotion, and behavior (i.e., personality) may be extended to include personal norms and the tendency to follow them. For instance, the representation of others (i.e., beliefs about others) is considered to be a major dimension in understanding human personality, and it has been measured through novel AI tools for a better understanding of fictional characters 40 . In fact, in the context of personal norms, 38 suggest that AI may be used "to better navigate the complex landscape of human morality and to better emulate human decision-making" (ibid, p. 11), even in contexts governed in the past by different methodologies, such as behavioral economics. One may hypothesize, for instance, that a person holding negative paranoid beliefs about others and following conspiracy theories may be less prone to express one-shot anonymous unselfishness when he considers his interlocutor to belong to out-group players. It is possible to measure the variability of norms using AI and tools based on language models. As explained by 41, p. 2 , "what matters is not just the monetary payoffs associated with actions, but also how these actions are described". For example, automatically analyzing a massive amount of textual data following Hurricane Catarina could have tested whether cooperative actions are described differently by people with different moral norms.
It could have been hypothesized that those with more negative beliefs about out-group members are more inclined to hold different norms. This hypothesis aligns with the findings of 42 . It may suggest that those individuals may also describe cooperative actions in less favorable language and may be more inclined to behave less cooperatively (e.g., fewer donations to charity outside their reference group). In addition, our paper is limited by the inevitable choice of datasets, theoretical approaches, and ML models. Therefore, the results presented here are preliminary and should be modestly limited to the specific context of this study.